Learning Distributions by their Density Levels

Authors

  • Shai Ben-David
  • Michael Lindenbaum
Abstract

We propose a mathematical model for learning the high-density areas of an unknown distribution from (unlabeled) random points drawn according to this distribution. Although this type of learning task has not previously been addressed in the Computational Learnability literature, we believe that it is a rather basic problem that appears in many practical learning scenarios. From a statistical theory standpoint, our model may be viewed as a restricted instance of the fundamental issue of inferring information about a probability distribution from the random samples it generates. From a computational learning angle, what we propose is a new framework of unsupervised concept learning. The examples provided to the learner in our model are not labeled (and are not necessarily all positive or all negative). The only information about their membership is disclosed to the learner indirectly, through the sampling distribution. We investigate the basic features of the proposed model and provide lower and upper bounds on the sample complexity of such learning tasks. Our main result is that the learnability of a class of distributions in this setting is equivalent to the finiteness of the VC-dimension of the class of the high-density areas of these distributions. One direction of the proof involves a reduction of density-level learnability to p-concept learnability, while the sufficiency condition is proved through the introduction of a generic learning algorithm.


Similar resources

Comparison Between Several Distributions of Exponential Family and Offering Their Features and Applications

In this paper, first, we investigate the probability density function and the failure rate function of some families of exponential distributions. Then we present their features such as expectation, variance, moments and maximum likelihood estimation and we identify the most flexible distributions according to the figure of probability density function and the failure rate function and f...

Learning distributions by their density-levels - a paradigm for learning without a teacher

We propose a mathematical model for learning the high-density areas of an unknown distribution from unlabeled random points drawn according to this distribution. Although this type of learning task has not previously been addressed in the Computational Learnability literature, we believe that it is a rather basic problem that appears in many practical learning scenarios. From a statistical theo...

Transparent Machine Learning Algorithm Offers Useful Prediction Method for Natural Gas Density

Machine-learning algorithms aid predictions for complex systems with multiple influencing variables. However, many neural-network related algorithms behave as black boxes in terms of revealing how the prediction of each data record is performed. This drawback limits their ability to provide detailed insights concerning the workings of the underlying system, or to relate predictions to specific ...

Approximating the Distributions of Singular Quadratic Expressions and their Ratios

Noncentral indefinite quadratic expressions in possibly nonsingular normal vectors are represented in terms of the difference of two positive definite quadratic forms and an independently distributed linear combination of standard normal random variables. This result also applies to quadratic forms in singular normal vectors for which no general representation is currently available. The ...

Modeling Magnetic Field in Heavy ion Collisions Using Two Different Nuclear Charge Density Distributions

By studying the properties of matter during heavy-ion collisions, a better understanding of the Quark-Gluon plasma is possible. One of the main areas of this study is the calculation of the magnetic field, particularly how the values of conductivity affect this field and how the field strength changes with proper time. In matching the theoretical calculations with results obtained in the lab, two diffe...

Publication date: 1997